Perceptually-based data-driven join costs: comparing join types
Authors
Abstract
Unit selection synthesis has improved the quality of synthetic speech by making it possible to concatenate speech from a large database to produce intelligible synthesis while preserving much of the naturalness of the original signal. Such synthesis is by no means perfect, however, and this paper describes work to achieve better joins between concatenated units. Results from a psychoacoustic experiment, acoustic parameters, and phonetic factors are analyzed and used in statistical training of join costs so that audible discontinuities at concatenation boundaries can be minimized.
Related resources
Data-driven perceptually based join costs
Concatenative speech synthesis systems attempt to minimize audible discontinuities between two successive concatenated units. In unit selection concatenative synthesis, a join cost is calculated that is intended to predict the extent of audible discontinuity introduced by the concatenation of two specific units. A study was conducted that used human perceptual data on the detectability of mid-v...
Normalized laplacian spectrum of two new types of join graphs
Let $G$ be a graph without isolated vertices. The normalized Laplacian matrix $\tilde{\mathcal{L}}(G)$ is defined as $\tilde{\mathcal{L}}(G)=\mathcal{D}^{-\frac{1}{2}}\mathcal{L}(G)\mathcal{D}^{-\frac{1}{2}}$, where $\mathcal{D}$ is a diagonal matrix whose entries are the degrees of the vertices of $G$. The eigenvalues of $\tilde{\mathcal{L}}(G)$ are called the normalized Laplacian eigenva...
High Dimensional Similarity Joins: Algorithms and Performance Evaluation
Current data repositories include a variety of data types, including audio, images, and time series. State-of-the-art techniques for indexing such data and doing query processing rely on a transformation of data elements into points in a multidimensional feature space. Indexing and query processing then take place in the feature space. In this paper, we study algorithms for finding relationshi...
Feature transformation applied to the detection of discontinuities in concatenated speech
The quality of concatenated speech depends on the degree of mismatch between successive units. Defining a perceptually salient join cost to represent the degree of mismatch has proven to be a difficult task. Such a join cost is critical in unit selection synthesis to ensure that the optimum sequence of speech units is selected from the units available in the speech inventory. In this study the ...